investment problem
A* search algorithm for an optimal investment problem in vehicle-sharing systems
Le, Ba Luat, Martin, Layla, Demir, Emrah, Vu, Duc Minh
We study an optimal investment problem that arises in vehicle-sharing systems. Given a set of locations at which to build stations, we need to determine i) the sequence in which stations are built and the number of vehicles to acquire in order to reach the target state where all stations are built, and ii) the number of vehicles to acquire and their allocation in order to maximize the total profit returned by operating the system when some or all stations are open. The profitability of operating the open stations, measured over a specific time period, is represented as a linear optimization problem over the set of open stations. With operating capital, the owner of the system can open new stations. This property makes the time required to open a new station set-dependent, and the optimal investment problem can be viewed as a variant of the Traveling Salesman Problem (TSP) with set-dependent costs. We propose an A* search algorithm to address this variant of the TSP. Computational experiments highlight the benefits of the proposed algorithm over the widely used Dijkstra algorithm, and we suggest future research directions on new possibilities and applications for both exact and approximate A* algorithms.
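The set-dependent search described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm or cost model: the function name `a_star_build_order`, the additive profit rates, the `base_rate` income, and the rule that opening a station takes its cost divided by the current profit rate are all assumptions made for illustration. The heuristic charges every remaining cost at the highest achievable rate, which is a lower bound as long as all rates are non-negative; setting it to zero gives Dijkstra's algorithm.

```python
import heapq
from itertools import count


def a_star_build_order(costs, rates, base_rate=1.0, use_heuristic=True):
    """Find the build order that opens every station in minimum total time.

    Toy model (an assumption, not the paper's): the open set S earns
    base_rate + sum(rates[i] for i in S) per unit time, and opening
    station j from S takes costs[j] / rate(S). Requires base_rate > 0,
    costs > 0 and rates >= 0. States are bitmasks of open stations.
    """
    n = len(costs)
    full = (1 << n) - 1

    def rate(mask):
        return base_rate + sum(rates[i] for i in range(n) if (mask >> i) & 1)

    max_rate = rate(full)  # no state can earn faster than the full set

    def h(mask):
        if not use_heuristic:
            return 0.0  # h = 0 reduces A* to Dijkstra's algorithm
        # Lower bound: pay every remaining cost at the best possible rate.
        remaining = sum(costs[i] for i in range(n) if not (mask >> i) & 1)
        return remaining / max_rate

    tie = count()  # tie-breaker so the heap never compares tuples of orders
    best_g = {0: 0.0}
    heap = [(h(0), next(tie), 0.0, 0, ())]
    expanded = 0
    while heap:
        _, _, g, mask, order = heapq.heappop(heap)
        if g > best_g.get(mask, float("inf")):
            continue  # stale entry: a cheaper path to this set was found
        expanded += 1
        if mask == full:
            return g, list(order), expanded
        r = rate(mask)
        for j in range(n):
            if (mask >> j) & 1:
                continue  # station j is already open
            nxt = mask | (1 << j)
            g_nxt = g + costs[j] / r  # set-dependent transition time
            if g_nxt < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_nxt
                entry = (g_nxt + h(nxt), next(tie), g_nxt, nxt, order + (j,))
                heapq.heappush(heap, entry)
    raise ValueError("no feasible build order")


if __name__ == "__main__":
    total_time, order, n_expanded = a_star_build_order([4.0, 6.0, 3.0], [2.0, 1.0, 3.0])
    print(total_time, order, n_expanded)
```

Calling the function with `use_heuristic=False` on the same data returns the same total time, so the two runs can be compared by the number of expanded states.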
Reinforcement Learning with Non-Exponential Discounting
Schultheis, Matthias, Rothkopf, Constantin A., Koeppl, Heinz
Commonly in reinforcement learning (RL), rewards are discounted over time using an exponential function to model time preference, thereby bounding the expected long-term reward. In contrast, in economics and psychology, it has been shown that humans often adopt a hyperbolic discounting scheme, which is optimal when a specific task termination time distribution is assumed. In this work, we propose a theory for continuous-time model-based reinforcement learning generalized to arbitrary discount functions. This formulation covers the case in which there is a non-exponential random termination time. We derive a Hamilton-Jacobi-Bellman (HJB) equation characterizing the optimal policy and describe how it can be solved using a collocation method, which uses deep learning for function approximation. Further, we show how the inverse RL problem can be approached, in which one tries to recover properties of the discount function given decision data. We validate the applicability of our proposed approach on two simulated problems. Our approach opens the way for the analysis of human discounting in sequential decision-making tasks.
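The link between discounting and a random termination time can be checked numerically. The abstract does not say which termination distribution it has in mind; the sketch below assumes one classical case, a constant but unknown hazard rate with an exponential prior of mean `k`, for which the expected survival probability is exactly the hyperbolic discount 1 / (1 + k t). The function names and the integration settings are illustrative choices.

```python
import math


def exponential_discount(t, rho):
    """Standard RL-style discount exp(-rho * t) in continuous time."""
    return math.exp(-rho * t)


def hyperbolic_discount(t, k):
    """Hyperbolic discount 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)


def expected_survival(t, k, steps=50_000):
    """E[exp(-lam * t)] for a hazard rate lam ~ Exponential(mean k).

    Assumption for illustration: the task ends at a constant but unknown
    hazard rate lam. The integral over lam is approximated with the
    midpoint rule on [0, 50 * k], beyond which the prior is negligible.
    """
    upper = 50.0 * k
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        prior = math.exp(-lam / k) / k  # exponential prior density, mean k
        total += math.exp(-lam * t) * prior * h
    return total


if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0):
        print(t, exponential_discount(t, 1.0), hyperbolic_discount(t, 1.0),
              expected_survival(t, 1.0))
```

With a known hazard rate the survival probability is the exponential discount itself; it is the averaging over an uncertain rate that produces the slower, hyperbolic decay.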
Optimal consumption-investment choices under wealth-driven risk aversion
CRRA utility, in which the risk aversion coefficient is constant, is common in economic models, but wealth-driven risk aversion rarely appears in investors' investment problems. This paper focuses on numerical solutions, obtained with a neural network, to the optimal consumption-investment choices under wealth-driven risk aversion. A jump-diffusion model is used to simulate the artificial data needed to train the neural network. The WDRA model is set up to describe the investment problem, and two parameters need to be optimized: the rate at which wealth is invested in the risky assets and the consumption over the investment horizon. Under this model, an LSTM neural network with a single objective function is implemented and shows promising results.
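For reference, the constant-risk-aversion baseline mentioned in this abstract can be sketched as follows. The code uses the standard CRRA form u(c) = c^(1 - gamma) / (1 - gamma), with log(c) at gamma = 1, and estimates the relative risk aversion -c u''(c) / u'(c) by finite differences to show that it does not vary with c. The function names are illustrative, and the abstract does not give the functional form of the wealth-driven coefficient, so that variant is not modeled here.

```python
import math


def crra_utility(c, gamma):
    """Standard CRRA utility; gamma is the constant risk aversion coefficient."""
    if c <= 0:
        raise ValueError("consumption must be positive")
    if abs(gamma - 1.0) < 1e-12:
        return math.log(c)  # limiting case gamma -> 1
    return c ** (1.0 - gamma) / (1.0 - gamma)


def relative_risk_aversion(utility, c, rel_step=1e-3):
    """Estimate -c * u''(c) / u'(c) with central finite differences."""
    h = rel_step * c
    u_prime = (utility(c + h) - utility(c - h)) / (2.0 * h)
    u_second = (utility(c + h) - 2.0 * utility(c) + utility(c - h)) / (h * h)
    return -c * u_second / u_prime


if __name__ == "__main__":
    gamma = 2.0
    for c in (0.5, 1.0, 10.0):
        # The estimate stays close to gamma at every consumption level.
        print(c, relative_risk_aversion(lambda x: crra_utility(x, gamma), c))
```

A wealth-driven specification would replace the constant `gamma` with a function of wealth, so the same estimate would then change with the level of wealth.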